Add data checks in client.py #30

Merged
merged 9 commits into from
Sep 22, 2024

@Sathya98 Sathya98 commented Jul 4, 2024

Change Description

Try to be precise. You can also add comments to your PR; this can help the reviewer a lot.

Added a function, check_training_data, that checks the X and y tables for basic integrity and mismatched lengths before they are sent to the TabPFN server.

Also added a unit test in test_client.py that exercises the function with the breast cancer dataset.
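For readers without the diff at hand, here is a minimal sketch of what a check like this might look like. It is not the code from this PR: the body of check_training_data, the returned shape tuple, and the MAX_ROWS and MAX_COLS values are all assumptions made for illustration, and the sketch assumes X is a list of rows or a 2-D NumPy array.

```python
# Hypothetical sketch only; the real check_training_data in this PR may differ.
MAX_ROWS = 10_000  # assumed limit, not taken from the PR
MAX_COLS = 500     # assumed limit, not taken from the PR


def check_training_data(X, y, max_rows=MAX_ROWS, max_cols=MAX_COLS):
    """Validate X and y before upload; raise ValueError on any problem.

    Returns (n_rows, n_cols) so callers can log the accepted shape.
    """
    n_rows = len(X)
    if n_rows == 0:
        raise ValueError("X must contain at least one row")
    if len(y) != n_rows:
        raise ValueError(f"X has {n_rows} rows but y has {len(y)} labels")

    n_cols = len(X[0])
    if any(len(row) != n_cols for row in X):
        raise ValueError("all rows of X must have the same number of columns")

    if n_rows > max_rows:
        raise ValueError(f"X has {n_rows} rows; the limit is {max_rows}")
    if n_cols > max_cols:
        raise ValueError(f"X has {n_cols} columns; the limit is {max_cols}")
    return n_rows, n_cols


# Example call with a tiny, well-formed table.
shape = check_training_data([[5.1, 3.5], [4.9, 3.0]], [0, 1])
```

Raising ValueError on the client side means a malformed table fails fast, before any network request is made.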

If you used new dependencies: Did you add them to requirements.txt?

Who did you ping on Mattermost to review your PR? Please ping that person again whenever you are ready for another review.

Breaking changes

If you made any breaking changes, please update the version number.
Breaking changes are totally fine; we just need to make sure to keep the users informed and the server in sync.

Does this PR break the API? If so, what is the corresponding server commit?

Does this PR break the user interface? If so, why?


Please do not mark comments/conversations as resolved unless you are the assigned reviewer. This helps maintain clarity during the review process.

@Sathya98 Sathya98 requested a review from liam-sbhoo July 4, 2024 13:03
@Sathya98 Sathya98 self-assigned this Jul 4, 2024
@liam-sbhoo
Collaborator

Can you ping me again once you have added the MAX_ROWS and MAX_COLS limits we mentioned in the meeting? :)

@Sathya98
Collaborator Author

Sathya98 commented Jul 4, 2024 via email

@liam-sbhoo
Collaborator

Please also set up your environment for ruff formatting according to the Development section of the README. With this, you'll probably pass the failing checks.

@liam-sbhoo
Collaborator

Good to go after the slight refactor in the test and the formatting fix. Thanks!!

@Sathya98
Collaborator Author

Good to merge now, reformatted accordingly!

@liam-sbhoo
Collaborator

liam-sbhoo commented Sep 21, 2024

@Sathya98
I made some modifications here:

  • moved the data size check to the estimator instead of ServiceClient (the split of responsibilities is clearer)
  • added more test cases
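The first point can be sketched as follows. This is only an illustration of the responsibility split, not the code merged in this PR: SketchEstimator, FakeServiceClient, upload_train_set, and the limit values are hypothetical names and numbers chosen for the example.

```python
class FakeServiceClient:
    """Hypothetical stand-in for the transport layer: it uploads, it does not validate."""

    def __init__(self):
        self.uploads = []

    def upload_train_set(self, X, y):
        self.uploads.append((X, y))
        return len(self.uploads)  # pretend this is a server-side dataset id


class SketchEstimator:
    """Hypothetical estimator that owns the size limits and checks them in fit()."""

    MAX_ROWS = 10_000  # assumed limit, not taken from the PR
    MAX_COLS = 500     # assumed limit, not taken from the PR

    def __init__(self, client):
        self.client = client

    def _validate_data_size(self, X):
        n_rows = len(X)
        n_cols = len(X[0]) if n_rows else 0
        if n_rows > self.MAX_ROWS:
            raise ValueError(f"too many rows: {n_rows} > {self.MAX_ROWS}")
        if n_cols > self.MAX_COLS:
            raise ValueError(f"too many columns: {n_cols} > {self.MAX_COLS}")

    def fit(self, X, y):
        # Validate first, so oversized data never reaches the client.
        self._validate_data_size(X)
        self.train_set_id_ = self.client.upload_train_set(X, y)
        return self
```

With this split, the client stays a thin wrapper around the server API, and the estimator, which already knows what a valid training set looks like, decides whether a request should be made at all.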

@SamuelGabriel could you help review this?

Collaborator

@SamuelGabriel SamuelGabriel left a comment


Just one tiny change I would request, then it is good to go!

tabpfn_client/estimator.py (outdated review thread)
Collaborator

@SamuelGabriel SamuelGabriel left a comment


LGTM

@liam-sbhoo liam-sbhoo merged commit 2329e70 into main Sep 22, 2024
2 checks passed
@liam-sbhoo liam-sbhoo deleted the add_data_checks branch September 22, 2024 09:25
3 participants